The variable-step L1 scheme preserving a compatible energy law for time-fractional Allen-Cahn equation
In this work, we revisit the adaptive L1 time-stepping scheme for solving the time-fractional Allen-Cahn equation with the Caputo fractional derivative. The implicit L1 scheme is shown to preserve a variational energy dissipation law on arbitrary nonuniform time meshes by using recent discrete analysis tools, namely the discrete orthogonal convolution kernels and the discrete complementary convolution kernels. Discrete embedding techniques and the fractional Grönwall inequality are then applied to establish a norm error estimate on nonuniform time meshes. An adaptive time-stepping strategy driven by the dynamical features of the system is presented to capture the multi-scale behaviors and to improve the computational performance. Comment: 17 pages, 20 figures, 2 tables
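The variable-step L1 discretization the abstract builds on can be sketched directly. Below is a minimal Python sketch (our illustration, not the paper's code) of the L1 approximation of the Caputo derivative of order alpha in (0, 1) on an arbitrary nonuniform mesh; the kernel formula comes from integrating (t_n - s)^(-alpha) exactly over each subinterval against a piecewise-linear interpolant of u.

```python
from math import gamma

def l1_caputo(u_vals, t, alpha):
    """Variable-step L1 approximation of the Caputo derivative of order
    alpha in (0, 1) at each grid point t[n] on a nonuniform mesh.
    u_vals[k] = u(t[k]); returns the approximations for n = 1..N."""
    out = []
    for n in range(1, len(t)):
        s = 0.0
        for k in range(1, n + 1):
            tau_k = t[k] - t[k - 1]
            # kernel a^{(n)}_{n-k}: exact integral of (t_n - s)^{-alpha}
            # over [t_{k-1}, t_k], divided by the local step size
            a = ((t[n] - t[k - 1])**(1 - alpha)
                 - (t[n] - t[k])**(1 - alpha)) / (gamma(2 - alpha) * tau_k)
            s += a * (u_vals[k] - u_vals[k - 1])
        out.append(s)
    return out
```

Because the scheme interpolates u piecewise-linearly, it is exact for linear u: with t_0 = 0 and u(t) = t, the sum telescopes to the Caputo derivative t^(1-alpha)/Γ(2-alpha) on any mesh, which makes a convenient sanity check.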
Re-Examining State Part C Early Intervention Program Coordinators’ Practices through a Positive Lens on Leadership: A Qualitative Secondary Analysis
Part C early intervention is a program administered under the Individuals with Disabilities Education Act (2004) that provides services to eligible infants and toddlers with disabilities and their families. Part C coordinators oversee the program in their states. This article presents an examination of state Part C program coordinators' leadership practices. We conducted a qualitative secondary analysis to explore the practices that Part C program coordinators described using in a prior study on the processes, barriers, and solutions during a systems change. The present study used two new theoretical frameworks – organizational drivers for systems change and a strengths-based orientation – to create a positive lens on leadership through which to view the identified practices. We selected five interview transcripts from five state Part C program coordinators that contained explicit reflections about leadership behaviors in systems as our primary data set. Five categories of leadership practice emerged from a progressive inductive-deductive coding process: meeting practitioners where they are, identifying leaders, establishing consistent procedures, readying professionals, and building relationships. These themes aligned with the organizational drivers of systems change and highlighted the consistent use of a specific type of leadership: facilitative administration. Implications for the study of systems leadership in early intervention are discussed.
Slimmable Networks for Contrastive Self-supervised Learning
Self-supervised learning has made great progress in large-model pre-training but struggles when training small models. Previous solutions to this problem mainly rely on knowledge distillation and involve a two-stage learning procedure: first train a large teacher model, then distill it to improve the generalization ability of small ones. In this work, we present a new one-stage solution for obtaining pre-trained small models without extra teachers: slimmable networks for contrastive self-supervised learning (SlimCLR). A slimmable network contains a full network and several weight-sharing sub-networks, so we can pre-train only once and obtain various networks, including small ones with low computational cost. However, in the self-supervised setting, interference between the weight-sharing networks leads to severe performance degradation. One symptom of this interference is gradient imbalance: a small proportion of the parameters produces dominant gradients during backpropagation, while the main parameters may not be fully optimized. Divergence in the gradient directions of the various networks may also cause interference. To overcome these problems, we make the main parameters produce dominant gradients and provide consistent guidance to the sub-networks via three techniques: slow-start training of sub-networks, online distillation, and loss re-weighting according to model size. In addition, a switchable linear probe layer is applied during linear evaluation to avoid interference between the weight-sharing linear layers. We instantiate SlimCLR with typical contrastive learning frameworks and achieve better performance than previous methods with fewer parameters and FLOPs. Comment: preprint, work in progress
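To make the weight-sharing idea concrete, here is a toy NumPy sketch (our illustration, not the SlimCLR code) of a slimmable linear layer whose sub-networks use the leading slice of the full weight matrix, together with one plausible size-based loss re-weighting; the paper's exact weighting scheme may differ.

```python
import numpy as np

class SlimmableLinear:
    """Toy weight-sharing linear layer: each width ratio r uses the
    leading r-fraction of the full weight matrix, so every sub-network
    shares its parameters with the full network."""
    def __init__(self, d_in, d_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)

    def forward(self, x, width_ratio=1.0):
        d_out = max(1, int(self.W.shape[0] * width_ratio))
        return x @ self.W[:d_out].T  # sub-network = leading output slice

def reweighted_loss(losses, widths):
    """Loss re-weighting by model size: larger (sub-)networks receive
    proportionally larger weights (an assumed, illustrative scheme)."""
    w = np.asarray(widths, dtype=float)
    w = w / w.sum()
    return float(np.dot(w, losses))
```

Because the sub-network is literally a slice of the full network, its output equals the corresponding slice of the full network's output; this shared computation is exactly what causes the gradient interference the abstract describes.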
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models
Misalignment between the outputs of a vision-language (VL) model and the task goal hinders its deployment. This issue can worsen when there are distribution shifts between the training and test data. To address the problem, prevailing fully test-time adaptation (TTA) methods bootstrap themselves through entropy minimization. However, minimizing the entropy of the predictions makes the model overfit to its own incorrect output distributions. In this work, we propose TTA with feedback to avoid such overfitting and to align the model with task goals. Specifically, we adopt CLIP as a reward model to provide feedback for VL models during test time on various tasks, including image classification, image-text retrieval, and image captioning. Given a single test sample, the model aims to maximize the CLIP reward through reinforcement learning. We adopt a reward design that uses the average CLIP score of the sampled candidates as the baseline. This design is simple yet surprisingly effective when combined with various task-specific sampling strategies. The entire system is flexible, allowing the reward model to be extended with multiple CLIP models, and a momentum buffer can be used to memorize and leverage the knowledge learned from multiple test samples. Extensive experiments demonstrate that our method significantly improves different VL models after TTA. Comment: preprint, work in progress; project URL: https://github.com/mzhaoshuai/RLC
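The reward design described above, with the average CLIP score of the sampled candidates as the baseline, reduces to a few lines. The sketch below (our illustration; a real system would obtain the rewards from a CLIP model) forms a REINFORCE-style loss in which candidates scoring above the batch mean are reinforced and the rest suppressed.

```python
import numpy as np

def reinforce_loss_with_baseline(log_probs, rewards):
    """REINFORCE-style loss with the average reward of the sampled
    candidates as the baseline: advantage_i = r_i - mean(r).
    Minimizing the returned value performs gradient ascent on
    E[advantage * log pi]."""
    r = np.asarray(rewards, dtype=float)
    advantages = r - r.mean()  # average-score baseline
    return float(-(advantages * np.asarray(log_probs)).sum())
```

Note that when all candidates receive the same reward the advantages vanish and the loss is zero, so the model is only updated when the reward actually discriminates between candidates.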
How to Unleash the Power of Large Language Models for Few-shot Relation Extraction?
Scaling language models has revolutionized a wide range of NLP tasks, yet few-shot relation extraction with large language models has received little comprehensive exploration. In this paper, we investigate two principal methodologies, in-context learning and data generation, for few-shot relation extraction with GPT-3.5 through exhaustive experiments. To enhance few-shot performance, we further propose task-related instructions and schema-constrained data generation. We observe that in-context learning can achieve performance on par with previous prompt-learning approaches, and that data generation with the large language model can boost previous solutions to new state-of-the-art few-shot results on four widely studied relation extraction datasets. We hope our work can inspire future research into the capabilities of large language models for few-shot relation extraction. Code is available at https://github.com/zjunlp/DeepKE/tree/main/example/llm. Comment: SustaiNLP Workshop@ACL 202
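An in-context learning setup for relation extraction of the kind this abstract evaluates can be sketched as a prompt builder: a task-related instruction followed by few-shot demonstrations and the query. The field names and the template below are illustrative assumptions, not the paper's exact prompt.

```python
def build_re_prompt(instruction, examples, query):
    """Assemble an in-context relation-extraction prompt: a task
    instruction, few-shot demonstrations with gold relations, then the
    query sentence left for the model to complete."""
    lines = [instruction, ""]
    for ex in examples:
        lines.append(f"Sentence: {ex['sentence']}")
        lines.append(f"Head: {ex['head']}  Tail: {ex['tail']}")
        lines.append(f"Relation: {ex['relation']}")
        lines.append("")
    lines.append(f"Sentence: {query['sentence']}")
    lines.append(f"Head: {query['head']}  Tail: {query['tail']}")
    lines.append("Relation:")  # the model fills in the label
    return "\n".join(lines)
```

Schema-constrained variants would additionally list the admissible relation labels in the instruction so the model's completion is restricted to the dataset's schema.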
Efficient Code Generation from SHIM Models
Programming concurrent systems is substantially more difficult than programming sequential systems, yet most embedded systems need concurrency. We believe this should be addressed through higher-level models of concurrency that eliminate many of the usual challenges, such as nondeterminism arising from races. The SHIM model of computation provides deterministic concurrency, and there already exist ways of implementing it in hardware and software. In this work, we describe how to produce more efficient C code from SHIM systems. We propose two techniques: a largely mechanical one that produces tail-recursive code for simulating concurrency, and a more clever one that statically analyzes the communication pattern of multiple processes to produce code with far less overhead. Experimentally, we find that our tail-recursive technique produces code that runs roughly twice as fast as a baseline, and that our statically scheduled code can run up to twelve times faster.
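The tail-recursive simulation technique can be illustrated in miniature. The paper generates C; the Python sketch below (a trampoline, since Python lacks tail-call elimination, and entirely our illustration rather than the paper's generated code) shows the idea: each process step does its work and returns the next step, so deterministic hand-off between processes requires no threads or scheduler.

```python
def run_trampoline(start):
    """Drive tail-call-style process simulation: each step returns the
    next step (a thunk) or None when all processes have finished."""
    step = start
    while step is not None:
        step = step()

def make_processes(log):
    """Two deterministic processes handing control to each other at
    fixed communication points, as statically scheduled code would."""
    def producer(i=0):
        if i == 3:
            return None          # all work done: stop the trampoline
        log.append(f"produce {i}")
        return lambda: consumer(i)   # 'tail call' into the consumer
    def consumer(i):
        log.append(f"consume {i}")
        return lambda: producer(i + 1)  # hand control back
    return producer
```

In the generated C, each return of a thunk corresponds to a genuine tail call that the compiler turns into a jump, which is why the simulation carries so little overhead compared with thread-based concurrency.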